error correlation
Error Adjustment Based on Spatiotemporal Correlation Fusion for Traffic Forecasting
Liu, Fuqiang, Ding, Weiping, Miranda-Moreno, Luis, Sun, Lijun
Deep neural networks (DNNs) play a significant role in an increasing body of research on traffic forecasting because they effectively capture the spatiotemporal patterns embedded in traffic data. A general assumption when training such forecasting models via mean squared error estimation is that the errors across time steps and spatial positions are uncorrelated. However, this assumption does not hold because of the autocorrelation caused by both the temporality and spatiality of traffic data. This gap limits the performance of DNN-based forecasting models and is overlooked by current studies. To fill this gap, this paper proposes Spatiotemporally Autocorrelated Error Adjustment (SAEA), a novel and general framework designed to systematically adjust autocorrelated prediction errors in traffic forecasting. Unlike existing approaches that assume prediction errors follow a random Gaussian noise distribution, SAEA models these errors as a spatiotemporal vector autoregressive (VAR) process to capture their intrinsic dependencies. First, it explicitly captures both spatial and temporal error correlations by a coefficient matrix, which is then embedded into a newly formulated cost function. Second, a structurally sparse regularization is introduced to incorporate prior spatial information, ensuring that the learned coefficient matrix aligns with the inherent road network structure. Finally, an inference process with test-time error adjustment is designed to dynamically refine predictions, mitigating the impact of autocorrelated errors in real-time forecasting. The effectiveness of the proposed approach is verified on different traffic datasets. Results across a wide range of traffic forecasting models show that our method enhances performance in almost all cases.
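The idea of test-time error adjustment can be illustrated with a deliberately simplified sketch. It is not the SAEA method: it assumes a single series whose residuals follow a scalar AR(1) process rather than a spatiotemporal VAR with a learned coefficient matrix, and the names `fit_ar1` and `adjust` are hypothetical.

```python
def fit_ar1(residuals):
    """Least-squares estimate of phi in e_t = phi * e_{t-1} + noise (no intercept)."""
    num = sum(prev * cur for prev, cur in zip(residuals, residuals[1:]))
    den = sum(prev * prev for prev in residuals[:-1])
    if den == 0:
        raise ValueError("all lagged residuals are zero; phi is undefined")
    return num / den


def adjust(base_forecast, last_residual, phi):
    """Add the predicted next error (phi * last observed error) to the base forecast."""
    return base_forecast + phi * last_residual


# Toy residuals (observed minus predicted) that halve at every step.
residuals = [8.0, 4.0, 2.0, 1.0]
phi = fit_ar1(residuals)

# Assumed base-model forecast of 50.0 for the next step.
adjusted = adjust(base_forecast=50.0, last_residual=residuals[-1], phi=phi)
```

If the residuals were truly uncorrelated, the fitted `phi` would be close to zero and the adjustment would leave the base forecast essentially unchanged.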
Quantifying Correlations of Machine Learning Models
Li, Yuanyuan, Sarna, Neeraj, Lin, Yang
Machine Learning models are being extensively used in safety-critical applications where errors from these models could cause harm to the user. Such risks are amplified when multiple machine learning models, which are deployed concurrently, interact and make errors simultaneously. This paper explores three scenarios where error correlations between multiple models arise, resulting in such aggregated risks. Using real-world data, we simulate these scenarios and quantify the correlations in errors of different models. Our findings indicate that aggregated risks are substantial, particularly when models share similar algorithms, training datasets, or foundational models. Overall, we observe that correlations across models are pervasive and likely to intensify with increased reliance on foundational models and widely used public datasets, highlighting the need for effective mitigation strategies to address these challenges.
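One common way to quantify error correlation between two classifiers is the Pearson correlation of their 0/1 error indicators on a shared labeled sample. The sketch below assumes that measure; the abstract does not state which statistic the paper uses, and the function names and toy data are hypothetical.

```python
from math import sqrt


def error_indicators(y_true, y_pred):
    """1.0 where the prediction is wrong, 0.0 where it is right."""
    return [1.0 if p != t else 0.0 for t, p in zip(y_true, y_pred)]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        # A model that is always right (or always wrong) has no error variance.
        raise ValueError("correlation is undefined for a constant error vector")
    return cov / sqrt(vx * vy)


# Toy data: model A errs on items 2 and 5, model B errs on items 2 and 7.
y_true = [0, 1, 1, 0, 1, 0, 1, 1]
pred_a = [0, 1, 0, 0, 1, 1, 1, 1]
pred_b = [0, 1, 0, 0, 1, 0, 1, 0]

err_a = error_indicators(y_true, pred_a)
err_b = error_indicators(y_true, pred_b)
rho = pearson(err_a, err_b)

# Fraction of items on which both models fail at once.
joint_error_rate = sum(a * b for a, b in zip(err_a, err_b)) / len(y_true)
```

In this toy sample each model has a 25% error rate, so independent errors would give a joint error rate of 6.25%; the observed joint rate is 12.5%, which is what a positive correlation looks like in practice.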
Multi-Agent Reinforcement Learning with Focal Diversity Optimization
Tekin, Selim Furkan, Ilhan, Fatih, Huang, Tiansheng, Hu, Sihao, Yahn, Zachary, Liu, Ling
The advancement of Large Language Models (LLMs) and their finetuning strategies has renewed interest in multi-agent reinforcement learning. In this paper, we introduce a focal diversity-optimized multi-agent reinforcement learning approach, named MARL-Focal, with three unique characteristics. First, we develop an agent-fusion framework for encouraging multiple LLM-based agents to collaborate in producing the final inference output for each LLM query. Second, we develop a focal-diversity optimized agent selection algorithm that can choose a small subset of the available agents based on how well they can complement one another to generate the query output. Finally, we design a conflict-resolution method to detect output inconsistency among multiple agents and produce our MARL-Focal output through reward-aware and policy-adaptive inference fusion. Extensive evaluations on five benchmarks show that MARL-Focal is cost-efficient and adversarial-robust. Our multi-agent fusion model achieves a performance improvement of 5.51\% compared to the best individual LLM-agent and offers stronger robustness over the TruthfulQA benchmark. Code is available at https://github.com/sftekin/rl-focal
The logic of NTQR evaluations of noisy AI agents: Complete postulates and logically consistent error correlations
In his "ship of state" allegory (\textit{Republic}, Book VI, 488) Plato poses a question -- how can a crew of sailors presumed to know little about the art of navigation recognize the true pilot among them? The allegory argues that a simple majority voting procedure cannot safely determine who is most qualified to pilot a ship when the voting members are ignorant or biased. We formalize Plato's concerns by considering the problem in AI safety of monitoring noisy AI agents in unsupervised settings. An algorithm evaluating AI agents using unlabeled data would be subject to the evaluation dilemma - how would we know the evaluation algorithm was correct itself? This endless validation chain can be avoided by considering purely algebraic functions of the observed responses. We can construct complete postulates that can prove or disprove the logical consistency of any grading algorithm. A complete set of postulates exists whenever we are evaluating $N$ experts that took $T$ tests with $Q$ questions with $R$ responses each. We discuss evaluating binary classifiers that have taken a single test - the $(N,T=1,Q,R=2)$ tests. We show how some of the postulates have been previously identified in the ML literature but not recognized as such - the \textbf{agreement equations} of Platanios. The complete postulates for pair correlated binary classifiers are considered and we show how they allow error correlations to be quickly calculated. An algebraic evaluator based on the assumption that the ensemble is error independent is compared with grading by majority voting on evaluations using the UCI Adult and \texttt{two-norm} datasets. Throughout, we demonstrate how the formalism of logical consistency via algebraic postulates of evaluation can help increase the safety of machines using AI algorithms.
Independence Tests Without Ground Truth for Noisy Learners
Corrada-Emmanuel, Andrés, Pantridge, Edward, Zahrebelski, Eddie, Chaganti, Aditya, Simeonov, Simeon
Exact ground truth invariant polynomial systems can be written for arbitrarily correlated binary classifiers. Their solutions give estimates for sample statistics that require knowledge of the ground truth of the correct labels in the sample. Of these polynomial systems, only a few have been solved in closed form. Here we discuss the exact solution for independent binary classifiers - resolving an outstanding problem that has been presented at this conference and others. Its practical applicability is hampered by its sole remaining assumption - the classifiers need to be independent in their sample errors. We discuss how to use the closed form solution to create a self-consistent test that can validate the independence assumption itself absent the correct labels ground truth. It can be cast as an algebraic geometry conjecture for binary classifiers that remains unsolved. A similar conjecture for the ground truth invariant algebraic system for scalar regressors is solvable, and we present the solution here. We also discuss experiments on the Penn ML Benchmark classification tasks that provide further evidence that the conjecture may be true for the polynomial system of binary classifiers.
Constructing Heterogeneous Committees Using Input Feature Grouping: Application to Economic Forecasting
Liao, Yuansong, Moody, John E.
Yuansong Liao and John Moody, Department of Computer Science, Oregon Graduate Institute, P.O. Box 91000, Portland, OR 97291-1000. Abstract: The committee approach has been proposed for reducing model uncertainty and improving generalization performance. The advantage of committees depends on (1) the performance of individual members and (2) the correlational structure of errors between members. This paper presents an input grouping technique for designing a heterogeneous committee. With this technique, all input variables are first grouped based on their mutual information. Statistically similar variables are assigned to the same group.
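The dependence on error correlation can be made concrete with a standard textbook identity, given here as an illustration rather than as a result from the paper. It assumes $N$ members whose errors $e_i$ have zero mean, a common variance $\sigma^2$, and a common pairwise correlation $\rho$; these symbols are assumptions of this sketch.

```latex
\bar{e} = \frac{1}{N}\sum_{i=1}^{N} e_i,
\qquad
\mathbb{E}\!\left[\bar{e}^{\,2}\right]
  = \frac{\sigma^2}{N} + \frac{N-1}{N}\,\rho\,\sigma^2 .
```

With $\rho = 0$ the averaged error variance falls as $\sigma^2/N$, while with $\rho = 1$ it stays at $\sigma^2$ and the committee gains nothing over a single member.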
Improving Committee Diagnosis with Resampling Techniques
Parmanto, Bambang, Munro, Paul W., Doyle, Howard R.
Central to the performance improvement of a committee relative to individual networks is the error correlation between networks in the committee. We investigated methods of achieving error independence between the networks by training the networks with different resampling sets from the original training set. The methods were tested on the sinwave artificial task and the real-world problems of hepatoma (liver cancer) and breast cancer diagnoses.
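A minimal sketch of the resampling idea follows, assuming bootstrap sampling (drawing with replacement) as the resampling scheme and a majority vote as the committee rule. The abstract names neither, so both choices and the helper names are assumptions.

```python
import random
from collections import Counter


def bootstrap_sets(training_set, n_members, seed=0):
    """Draw one same-size resample (with replacement) per committee member."""
    rng = random.Random(seed)
    n = len(training_set)
    return [rng.choices(training_set, k=n) for _ in range(n_members)]


def majority_vote(member_predictions):
    """Most common label among the members' predictions for one input.

    Ties resolve to the label that appears first in the list.
    """
    return Counter(member_predictions).most_common(1)[0][0]


# Stand-in training set: ten example indices.
training_set = list(range(10))
resampled = bootstrap_sets(training_set, n_members=3)

# Each member would be trained on its own resample; here the member
# predictions for a single input are simply given.
committee_label = majority_vote([1, 0, 1])
```

Because each member sees a different resample, the members are less likely to make the same mistakes, which is the error independence the abstract is after.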